AITopics | Statistical Learning

Collaborating Authors

Statistical Learning

News Overviews Instructional Materials AI-Alerts Classics

Optimization Can Learn Johnson Lindenstrauss Embeddings

Neural Information Processing SystemsMay-30-2025, 03:26:55 GMT

Embeddings play a pivotal role across various disciplines, offering compact representations of complex data structures. Randomized methods like Johnson-Lindenstrauss (JL) provide state-of-the-art and essentially unimprovable theoretical guarantees for achieving such representations. These guarantees are worst-case and in particular, neither the analysis, nor the algorithm, takes into account any potential structural information of the data. The natural question is: must we randomize? Could we instead use an optimization-based approach, working directly with the data?

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: Europe (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Add feedback

Multi-Stage Predict+Optimize for (Mixed Integer) Linear Programs

Neural Information Processing SystemsMay-30-2025, 03:26:32 GMT

The recently-proposed framework of Predict+Optimize tackles optimization problems with parameters that are unknown at solving time, in a supervised learning setting. Prior frameworks consider only the scenario where all unknown parameters are (eventually) revealed at the same time. In this work, we propose Multi-Stage Predict+Optimize, a novel extension catering to applications where unknown parameters are instead revealed in sequential stages, with optimization decisions made in between. We further develop three training algorithms for neural networks (NNs) for our framework as proof of concept, all of which can handle mixed integer linear programs. The first baseline algorithm is a natural extension of prior work, training a single NN which makes a single prediction of unknown parameters.

artificial intelligence, baseline, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Wisconsin (0.14)
North America > United States > California (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Banking & Finance (1.00)
Energy > Oil & Gas (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Appendix for: Invertible Gaussian Reparameterization

Neural Information Processing SystemsMay-30-2025, 03:08:48 GMT

As mentioned in section 3.1, we can use the matrix determinant lemma to efficiently compute the determinant of the Jacobian of the softmax Proof: For k = 1,..., K 1, we have: P(H = k) = Note that the involved integrals are one-dimensional and thus can be accurately approximated with quadrature methods. As mentioned in the main manuscript, our VAE experiments closely follow Maddison et al. [4]: we use the same continuous objective and the same evaluation metrics. Using the former KL results in optimizing a continuous objective which is not a log-likelihood lower bound anymore, which is mainly why we followed Maddison et al. [4]. In addition to the reported comparisons in the main manuscript, we include further comparisons in Table 1 reporting the discretized training ELBO instead. These are variance reduction techniques which heavily lean on the GS to improve the variance of the obtained gradients.

approximation, artificial intelligence, machine learning, (13 more...)

Neural Information Processing Systems

Country: North America > Canada (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.36)

Add feedback

Leveraging Tumor Heterogeneity: Heterogeneous Graph Representation Learning for Cancer Survival Prediction in Whole Slide Images

Neural Information Processing SystemsMay-30-2025, 02:28:30 GMT

Survival prediction is a significant challenge in cancer management. Tumor microenvironment is a highly sophisticated ecosystem consisting of cancer cells, immune cells, endothelial cells, fibroblasts, nerves and extracellular matrix.

category, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Health & Medicine > Therapeutic Area > Oncology > Carcinoma (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
(3 more...)

Add feedback

The Sample Complexity of Gradient Descent in Stochastic Convex Optimization

Neural Information Processing SystemsMay-30-2025, 02:26:33 GMT

We analyze the sample complexity of full-batch Gradient Descent (GD) in the setup of non-smooth Stochastic Convex Optimization.

algorithm, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.71)

Add feedback

SOFTS: Efficient Multivariate Time Series Forecasting with Series-Core Fusion

Neural Information Processing SystemsMay-30-2025, 02:11:45 GMT

Multivariate time series forecasting plays a crucial role in various fields such as finance, traffic management, energy, and healthcare. Recent studies have highlighted the advantages of channel independence to resist distribution drift but neglect channel correlations, limiting further enhancements. Several methods utilize mechanisms like attention or mixer to address this by capturing channel correlations, but they either introduce excessive complexity or rely too heavily on the correlation to achieve satisfactory results under distribution drifts, particularly with a large number of channels. Addressing this gap, this paper presents an efficient MLP-based model, the Series-cOre Fused Time Series forecaster (SOFTS), which incorporates a novel STar Aggregate-Redistribute (STAR) module. Unlike traditional approaches that manage channel interactions through distributed structures, e.g., attention, STAR employs a centralized strategy to improve efficiency and reduce reliance on the quality of each channel. It aggregates all series to form a global core representation, which is then dispatched and fused with individual series representations to facilitate channel interactions effectively. SOFTS achieves superior performance over existing state-of-the-art methods with only linear complexity. The broad applicability of the STAR module across different forecasting models is also demonstrated empirically.

data mining, machine learning, natural language, (22 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (0.67)
Energy > Power Industry (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

Generalization error in high-dimensional perceptrons: Approaching Bayes error with convex optimization Benjamin Aubin

Neural Information Processing SystemsMay-30-2025, 02:11:05 GMT

We consider a commonly studied supervised classification of a synthetic dataset whose labels are generated by feeding a one-layer neural network with random i.i.d inputs. We study the generalization performances of standard classifiers in the high-dimensional regime where α = n/d is kept finite in the limit of a high dimension d and number of samples n.

artificial intelligence, generalization error, machine learning, (15 more...)

Neural Information Processing Systems

Country:

North America > United States (0.68)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.30)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.41)

Add feedback

BMRS: Bayesian Model Reduction for Structured Pruning

Neural Information Processing SystemsMay-30-2025, 02:10:31 GMT

Modern neural networks are often massively overparameterized leading to high compute costs during training and at inference. One effective method to improve both the compute and energy efficiency of neural networks while maintaining good performance is structured pruning, where full network structures (e.g.

artificial intelligence, machine learning, pruning, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom (0.14)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.65)

Add feedback

Differentially Private Optimization with Sparse Gradients

Neural Information Processing SystemsMay-30-2025, 01:12:56 GMT

Motivated by applications of large embedding models, we study differentially private (DP) optimization problems under sparsity of individual gradients. We start with new near-optimal bounds for the classic mean estimation problem but with sparse data, improving upon existing algorithms particularly for the highdimensional regime. The corresponding lower bounds are based on a novel blockdiagonal construction that is combined with existing DP mean estimation lower bounds. Next, we obtain pure-and approximate-DP algorithms with almost optimal rates for stochastic convex optimization with sparse gradients; the former represents the first nearly dimension-independent rates for this problem. Furthermore, by introducing novel analyses of bias reduction in mean estimation and randomly-stopped biased SGD we obtain nearly dimension-independent rates for near-stationary points for the empirical risk in nonconvex settings under approximate-DP.

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Semi-Open 3D Object Retrieval via Hierarchical Equilibrium on Hypergraph

Neural Information Processing SystemsMay-30-2025, 01:07:08 GMT

Existing open-set learning methods consider only the single-layer labels of objects and strictly assume no overlap between the training and testing sets, leading to contradictory optimization for superposed categories. In this paper, we introduce a more practical Semi-Open Environment setting for open-set 3D object retrieval with hierarchical labels, in which the training and testing set share a partial label space for coarse categories but are completely disjoint from fine categories. We propose the Hypergraph-Based Hierarchical Equilibrium Representation (HERT) framework for this task. Specifically, we propose the Hierarchical Retrace Embedding (HRE) module to overcome the global disequilibrium of unseen categories by fully leveraging the multi-level category information. Besides, tackling the feature overlap and class confusion problem, we perform the Structured Equilibrium Tuning (SET) module to utilize more equilibrial correlations among objects and generalize to unseen categories, by constructing a superposed hypergraph based on the local coherent and global entangled correlations. Furthermore, we generate four semi-open 3DOR datasets with multi-level labels for benchmarking. Results demonstrate that the proposed method can effectively generate the hierarchical embeddings of 3D objects and generalize them towards semi-open environments.

artificial intelligence, category, machine learning, (15 more...)

Neural Information Processing Systems

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology (0.46)

Technology: